Predicting the distributions of Scleroderma guani (Hymenoptera: Bethylidae) under climate change in China

Abstract The wasp Scleroderma guani is an important parasitic natural enemy of a variety of stem borers such as longicorn beetles. Clarifying the suitable area of this wasp plays an important role in controlling stem borers. Based on information about the actual distribution of S. guani and on a set of environmental variables, the MaxEnt niche model and ArcGIS were used to predict the potential distribution of this insect in China. This work simulated the geographical distribution of the potential climatic suitability of S. guani in China at present and in different periods in the future. By combining the relative percent contribution of the environmental factors with the Jackknife test, the dominant environmental variables restricting the potential geographical distribution of S. guani, and their appropriate values, were screened. The results showed that the prediction of the MaxEnt model was highly consistent with the actual distribution under current climate conditions, and the simulation accuracy was very high. The distribution of S. guani is mainly affected by bio18 (Precipitation of Warmest Quarter), bio11 (Mean Temperature of Coldest Quarter), bio13 (Precipitation of Wettest Month), and bio3 (Isothermality). The suitable habitat of S. guani in China is mainly distributed in the Northeast China Plain, North China Plain, middle‐lower Yangtze Plain, and Sichuan Basin, with a total suitable area of 547.05 × 10⁴ km², accounting for 56.85% of China’s territory. Furthermore, under the RCP2.6, RCP4.5, and RCP8.5 climate change scenarios in the 2050s and 2090s, the areas of high, medium, and low suitability showed different degrees of change compared with the present, with an overall expanding trend in the future. This work provides theoretical support for related research on pest control and ecological protection.


| INTRODUCTION
Scleroderma guani (Hymenoptera: Bethylidae), an ectoparasitic wasp, is a natural enemy that uses the larvae and pupae of Coleoptera, Lepidoptera, and other insects (particularly long-horned beetles) as hosts (Zheng et al., 2015). It is widely distributed in China, including Hebei, Shandong, Henan, Guangdong, Hunan, and Jiangsu provinces (Hu et al., 2014). The wasp was first discovered in 1973 and was successfully reared indoors for the first time in 1977 (Zhang, 2018). The occurrence of S. guani is related to host distribution and is affected by climate-related variables, especially temperature and rainfall. It is an effective natural enemy for the prevention and control of borer pests such as long-horned beetles, buprestid beetles, and some engraver beetles. In nature, parasitism begins once adult wasps locate their hosts and successfully lay eggs on the host surface. In addition, the insect is an inhibitory parasitic wasp (Luo & Li, 2018; Zhang et al., 2015). Before laying eggs, the female stings the host and injects venom to paralyze it, and then lays eggs on the unresisting host (Yao & Yang, 2008). The female S. guani also has a strong ability to hunt and attack hosts. Scleroderma guani has a high parasitism rate on small- and medium-sized boring pests (Zhang et al., 2015). Generally, one generation of S. guani, comprising egg, larva, pupa, and adult, is completed in about a month at 25°C. This wasp is a valuable biological control agent.
Ecological Niche Models (ENMs) infer the relationship between species distribution and environmental variables by relating a sample of occurrence data for the target species to the values of the environmental variables at the sample localities. They then use this relationship to identify the regions that satisfy the niche requirements of the target species (Hutchinson, 1965; Peterson et al., 2011; Zhu et al., 2013) and regard those regions as parts of the potential distribution. ENMs are crucial tools for ecological research (Booth, 2018). In the past few decades, ENMs have been widely applied to study the distribution of species. Many studies have demonstrated that the MaxEnt model has certain advantages in terms of prediction accuracy, particularly when few distribution records are available for the target species (Phillips et al., 2006; Saatchi et al., 2008; Yi et al., 2017). Zhang et al. (2016) compared the prediction accuracy of 4 commonly used niche models, and the results showed that the MaxEnt model had the better prediction effect. Elith et al. (2006) compared the prediction ability of various niche models and concluded that MaxEnt had the highest prediction ability among 16 models. Consequently, MaxEnt was selected as the simulation software in this study. The MaxEnt model is relatively convenient to use and requires only a small sample size (Ma & Sun, 2018). Since Phillips proposed this model, MaxEnt has been commonly applied in the assessment of the potential distribution of species (Zhou et al., 2016), the protection of endangered plants and animals (Zheng et al., 2016), the risk evaluation of species invasion (Rodríguez-Merino et al., 2018), and the assessment of pest and disease spread and control (Zaidi et al., 2016), with good simulation outcomes.
The MaxEnt model has been used to predict the suitable areas of insects under current and future climate conditions, which can clarify the impact of climate change on insect distribution and provide a basis for further research on insects (Huang et al., 2020; Zhao & Shi, 2019).
In this work, the MaxEnt and ArcGIS technologies were used to analyze the environmental suitability of S. guani, based on known distribution data and combined with environmental data in China.
Predicting the current and future potential distribution of S. guani will provide a theoretical basis for the control of pests, particularly stem borers.

| Species data sources and processing
The crucial prerequisite for building a niche model is that there should be enough existing species records (Zhang et al., 2019).

Scleroderma guani first appeared in Guangdong province in 1973 and in Shandong province in 1975. Soon afterwards, it was discovered in many provinces of China (Zhang et al., 1987). The natural distribution points of S. guani were compiled by querying the records of the Global Biodiversity Information Facility (GBIF, https://www.gbif.org/), consulting the relevant published literature and books, and integrating GPS field survey data. The records were converted to uniform latitude and longitude coordinates (WGS84 geographic coordinate system), and the coordinates of the distribution points were confirmed with Google Earth online (http://www.earthol.com/). The collected distribution points were imported into ArcGIS. A buffer analysis was used to screen the distribution points in order to reduce the over-fitting caused by strong spatial correlation. Since the spatial resolution of the environmental variables was 2.5 arc-min (about 4.5 km²), the buffer radius was set to 1.5 km; when the distance between two distribution points was <3 km, only one point was retained. Ultimately, a total of 124 valid sites were obtained (Figure 1); these records were exported as a CSV file for further model analysis.
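The distance-based screening described above can be illustrated with a minimal sketch. It is not the ArcGIS buffer workflow used in this study: it assumes a greedy rule (keep a record only if it lies at least 3 km from every record already kept) and a spherical-Earth haversine distance. The function names and the demo coordinates are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (spherical approximation)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def thin_points(points, min_dist_km=3.0):
    """Greedy thinning: keep a point only if it is >= min_dist_km from all kept points."""
    kept = []
    for lat, lon in points:
        if all(haversine_km(lat, lon, klat, klon) >= min_dist_km for klat, klon in kept):
            kept.append((lat, lon))
    return kept

# Hypothetical (lat, lon) records, not the study's occurrence data.
demo = [(30.000, 104.000), (30.010, 104.000), (30.100, 104.000)]
thinned = thin_points(demo)  # the second record is about 1.1 km from the first and is dropped
```

With a greedy rule the result depends on the order of the records, so a real workflow would fix the order (or use a dedicated thinning tool) to keep the screening reproducible.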

| Environmental factors
The theoretical basis of the ENMs is the concept of ecological niche, which is defined as the position occupied by a population in an ecosystem in time and space and its relationship and role with other populations (Hutchinson, 1965). Environmental variables play an important role in the niche distribution of species.
In order to comprehensively explore the effects of climate on the spread of S. guani in China, the environmental variables considered in this work were extracted from the Worldclim database (Version 2.0, http://www.worldclim.org/). Climatic variables used in the model are shown in Table 1. The global climate model used in this study was BCC-CSM1-1, which has been found to simulate China's regional climate well (Feng, 2012). We obtained future climate data for the 2050s (2041-2060) and 2090s (2081-2100) from the Climate Change, Agriculture and Food Security website (CCAFS, https://ccafs.cgiar.org/). The Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC) considered four greenhouse gas concentration scenarios (Petersen, 2013). Zhang et al. (2014) showed that the RCP4.5 scheme has a higher priority than RCP6.0, so this study did not use RCP6.0.
Three representative concentration pathways (RCPs), comprising the minimum greenhouse gas emission scenario (RCP2.6), the medium greenhouse gas emission scenario (RCP4.5), and the maximum greenhouse gas emission scenario (RCP8.5), were selected to simulate the distribution of the species in this work.  (Yan et al., 2020), and the variables which had low percent contribution (<1%) were removed (Zhu et al., 2014).

| MaxEnt modeling
After that, the Pearson correlation coefficient (r) was calculated for the remaining climate variables in R. According to widespread practice in ecological niche modeling, highly correlated variables (|r| ≥ 0.8) were removed to exclude the effect of collinearity on the model and further improve the accuracy of the simulation (Xu et al., 2019).
Ultimately, the MaxEnt model was refitted using only the highly contributing and uncorrelated environmental predictors.
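The collinearity screening can be sketched as follows. The text gives the threshold (|r| ≥ 0.8) but not the rule for choosing which member of a correlated pair to drop, so this sketch assumes that the variable with the higher percent contribution is kept. The variable names, site values, and contribution scores are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_collinear(values, contribution, threshold=0.8):
    """Visit variables from highest to lowest contribution (assumed tie-break rule);
    keep a variable only if |r| < threshold with every variable already kept."""
    kept = []
    for name in sorted(values, key=contribution.get, reverse=True):
        if all(abs(pearson_r(values[name], values[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical values sampled at five sites.
values = {
    "var_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "var_b": [2.1, 3.9, 6.2, 8.1, 9.8],  # almost proportional to var_a
    "var_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
contribution = {"var_a": 40.0, "var_b": 25.0, "var_c": 10.0}
selected = drop_collinear(values, contribution)  # var_b is dropped
```

A purely mechanical rule like this one would discard one variable of any pair above the threshold; exceptions justified on biological grounds have to be handled outside the filter.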
Then, the distribution range of S. guani in China was extracted by ArcGIS, and the climatic suitability of the wasp was analyzed. The output of the MaxEnt simulation ranged from 0 to 1, and the closer the value was to 1, the higher the probability of species presence (Wang et al., 2017). Referring to the IPCC report on the assess-

| Model optimization and evaluation
In this research, the default parameters of the MaxEnt software were RM = 1 and FC = LQHPT. In R, the regularization multiplier (RM) and feature classes (FC) were tuned with the ENMeval package to optimize the MaxEnt model (Kass et al., 2021; Zhou et al., 2016). MaxEnt provides five feature classes, L (linear), Q (quadratic), H (hinge), P (product), and T (threshold), which can generate 31 different combinations. RM was varied from 0.1 to 4 in steps of 0.1, so that 40 RM values were evaluated. The ENMeval package was used to test the resulting 1240 parameter combinations (31 × 40). AUC DIFF (the difference between the training AUC and the test AUC) and the test omission rate were used to assess the fit of the model to the species distribution. The closer the test omission rate is to the theoretical omission rate, the higher the accuracy of the constructed model (Shcheglovitova & Anderson, 2013). The Akaike information criterion corrected for small sample sizes (AICc) was used to evaluate the goodness of fit and complexity of the different parameter combinations. The parameter combination with the lowest AICc value (ΔAICc = 0) was selected to fit the optimized model (Jia et al., 2019).
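The size of the tuning grid and the selection criterion can be checked with a short sketch. It only enumerates the parameter combinations and evaluates the standard AICc formula; it does not fit MaxEnt models and is not the ENMeval implementation. Here k is taken to be the number of model parameters and n the number of occurrence records (124 in this study); the log-likelihood in the usage note is a hypothetical value.

```python
from itertools import combinations
import math

FEATURES = "LQHPT"  # linear, quadratic, hinge, product, threshold

# All non-empty subsets of the five feature classes: 2**5 - 1 = 31.
fc_combos = ["".join(c) for k in range(1, 6) for c in combinations(FEATURES, k)]

# Regularization multipliers 0.1, 0.2, ..., 4.0 (40 values).
rm_values = [round(0.1 * i, 1) for i in range(1, 41)]

# Full tuning grid: 31 * 40 = 1240 candidate settings.
grid = [(fc, rm) for fc in fc_combos for rm in rm_values]

def aicc(log_likelihood, k, n):
    """AICc = 2k - 2 ln L + 2k(k + 1) / (n - k - 1); infinite when n - k - 1 <= 0."""
    if n - k - 1 <= 0:
        return math.inf
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)

def delta_aicc(scores):
    """Difference between each candidate's AICc and the lowest AICc in the set."""
    best = min(scores.values())
    return {key: s - best for key, s in scores.items()}
```

For example, `aicc(-100.0, 5, 124)` evaluates to about 210.51, and the candidate whose `delta_aicc` entry equals 0 is the one that would be selected.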
After optimization, the optimal parameters were used to simulate and predict the suitable habitat of S. guani in different periods. The accuracy of the simulation results was evaluated using the receiver operating characteristic (ROC) curve, and the area under the curve (AUC) was used to evaluate the predictive performance of the model (Na et al., 2018). The AUC ranges between 0 and 1: a value <0.8 means the simulation result has low reliability, a value in the range 0.8-0.9 means the simulation result is more accurate, and a value in the range 0.9-1 means the simulation result is very accurate.
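As an illustration of the evaluation metric, the sketch below computes AUC in its rank-based form: the probability that a randomly chosen presence site receives a higher score than a randomly chosen background site. This is a generic calculation, not the MaxEnt internal routine. How the class boundaries (exactly 0.8 or 0.9) are assigned is an assumption, and the demo scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: share of presence/background pairs in which the presence
    site scores higher; tied pairs count as half."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

def reliability(auc_value):
    """Map an AUC value to the three classes used in the text
    (treatment of the exact boundaries 0.8 and 0.9 is an assumption)."""
    if auc_value >= 0.9:
        return "very accurate"
    if auc_value >= 0.8:
        return "more accurate"
    return "low reliability"

# Hypothetical suitability scores, not model output from this study.
demo_auc = auc([0.9, 0.8, 0.4], [0.1, 0.3, 0.5, 0.2])  # 11 of 12 pairs are ranked correctly
```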

| Model performance and key environmental variables
Based on the percent contribution and Pearson's correlation coefficient, eight key environmental variables were screened out, and the species distribution model was reconstructed. The Pearson correlation coefficients of these eight environmental variables are shown in Table 3. Except for the pair bio3 and bio4 (|r| = 0.851), all values were lower than 0.8. Because the percent contributions of bio3 and bio4 in this study were relatively large, and these two factors have a great impact on the distribution of many insects (Wang et al., 2017; Xu, Liu, et al., 2021; Xu, Tang, et al., 2021; Zhao & Shi, 2019), both were retained. As shown in Figure 2a, the predicted omission rate agreed closely with the test sample omission rate, which indicated the good prediction effect of the model. Figure 2b shows the ROC curve of the model; the AUC value reached 0.988, which indicated that the model's prediction accuracy was excellent. This model was therefore reliable for confirming the potential distribution of S. guani in China.

| Predicting the current distribution of S. guani in China
Projection of the suitability for S. guani across China, according to the optimized MaxEnt model, is shown in Figure 3. The statistics for the predicted areas of S. guani in different provinces are displayed in
As shown in Table 5 and Figure 5, under the RCP2.6 scenario, model projections suggest that by the 2090s the high suitability area will decrease the most compared with current conditions, with a reduction of 17.64 × 10⁴ km², or 19.65% of the currently predicted high suitability area. From the present to the 2050s and the 2090s, highly suitable areas will tend to shift to medium suitability. In particular, highly suitable areas will decline significantly in the provinces of Hubei, Anhui, Jiangsu, and Liaoning. Under the RCP4.5 scenario, low suitability areas will shift to medium and high suitability in the 2050s. The high suitability area will rise by 11.06 × 10⁴ km² in the 2050s and fall by 2.58 × 10⁴ km² in the 2090s, corresponding to 12.33% and 2.87% of the currently predicted area, respectively. From the present to the 2050s and to the 2090s, the medium suitability area will increase in extent, with the percentage rising by 2.91% and 2.45%, respectively. Many currently low suitability areas in the provinces of Fujian, Zhejiang, Jiangxi, and Hunan will turn to medium suitability. Under the RCP8.5 scenario, the high suitability area will increase by 20.15 × 10⁴ km² and 2.74 × 10⁴ km² from the present to the 2050s and the 2090s, respectively, corresponding to 22.46% and 3.06% of the currently predicted area. The significantly increased high suitability areas will mainly be distributed in Guangxi, Guizhou, and Hunan provinces.
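The area changes and percentages quoted above can be cross-checked with simple arithmetic: dividing each absolute change by its percentage gives the current high suitability area that the pair implies. The sketch below does only that. The baseline it recovers is an inferred quantity, not a figure stated in this section, and the dictionary layout is a choice made for the illustration.

```python
# (scenario, period): (change in high suitability area in 10^4 km^2, percent of current area)
# Values are the ones quoted in the text; negative numbers are decreases.
changes = {
    ("RCP2.6", "2090s"): (-17.64, 19.65),
    ("RCP4.5", "2050s"): (11.06, 12.33),
    ("RCP4.5", "2090s"): (-2.58, 2.87),
    ("RCP8.5", "2050s"): (20.15, 22.46),
    ("RCP8.5", "2090s"): (2.74, 3.06),
}

# Implied current high suitability area (10^4 km^2) for each pair of numbers.
implied_baseline = {
    key: abs(delta) / (pct / 100.0) for key, (delta, pct) in changes.items()
}

# Spread between the largest and smallest implied baseline.
spread = max(implied_baseline.values()) - min(implied_baseline.values())
```

All five pairs point to a current high suitability area of roughly 89.5 to 89.9 × 10⁴ km², so the reported changes and percentages are mutually consistent to within rounding.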

| Relationship between environmental variables and geographical distribution
The selected environmental variables that have a noticeable impact on the distribution of S. guani were analyzed by the Jackknife test. As shown in Figure 6, all the predictors affected the potential distribution of S. guani to some extent, with bio18 being the most important when used alone. In Figure 6, the blue band represents the importance of each variable to species distribution; the longer the band, the more important the variable. According to the response curves of the environmental variables in the MaxEnt model (Figure 7), the appropriate ranges of the environmental variables for the potential distribution of S. guani were determined, as shown in Table 6. When the mean temperature of coldest quarter (bio11) was between −11.5°C and 14.9°C, the predicted suitability of S. guani was higher than 0.33, and the predicted suitability was highest at 2°C, reaching 0.63. The small range of appropriate values for bio11 suggested that S. guani is highly sensitive to extreme temperature changes. When the precipitation of wettest month (bio13) was lower than 120.0 mm, the predicted suitability of S. guani was lower than 0.33. With increasing precipitation, the predicted suitability increased quickly and peaked at 175.1 mm. When the precipitation exceeded 364.5 mm, the predicted suitability dropped below 0.33 again. A slight change in bio3 can have a significant effect on the distribution of S. guani, suggesting that it prefers areas with less temperature variation. The appropriate range of the response curve for isothermality (bio3) was 23.4%-36.9%, and the most appropriate value was 29.6%. The suitable ranges of bio15, alt, bio4, and bio19 are all shown in Table 6.
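Reading a suitable range off a response curve amounts to finding where the curve crosses the suitability threshold (0.33 in the text) and where it peaks. The sketch below shows one way to do this for a sampled curve, assuming linear interpolation between samples and a single hump above the threshold. The curve values are synthetic and are not taken from Figure 7.

```python
def suitable_range(xs, ys, threshold=0.33):
    """Return (low, high) x-values at which a sampled response curve crosses the
    threshold, by linear interpolation; returns None if it never crosses.
    Assumes one contiguous interval above the threshold."""
    crossings = []
    for i in range(len(xs) - 1):
        y0, y1 = ys[i], ys[i + 1]
        if (y0 - threshold) * (y1 - threshold) < 0:  # sign change between samples
            t = (threshold - y0) / (y1 - y0)
            crossings.append(xs[i] + t * (xs[i + 1] - xs[i]))
    if not crossings:
        return None
    return min(crossings), max(crossings)

def optimum(xs, ys):
    """Return the sampled (x, y) pair with the highest suitability."""
    i = max(range(len(ys)), key=lambda j: ys[j])
    return xs[i], ys[i]

# Synthetic unimodal response curve (variable value, predicted suitability).
xs = [0, 100, 200, 300, 400]
ys = [0.03, 0.23, 0.63, 0.43, 0.13]
low, high = suitable_range(xs, ys)  # crossings near 125 and 333
peak_x, peak_y = optimum(xs, ys)    # peak at x = 200
```

Because each curve is analyzed on its own, ranges obtained this way describe one variable at a time and say nothing about interactions between variables.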

| DISCUSSION
The MaxEnt model was used to simulate the potential geographical distribution of S. guani in China. The results showed that the highly suitable areas were mainly located in Sichuan, Hebei, Shandong, Jiangsu, Guangdong, Beijing, Chongqing, and other regions, which was consistent with previous research on this species (Xiao & Wu, 1983; Zheng et al., 2015), although the predicted suitable distribution ranges were broader in this study. MaxEnt has advantages and has been widely used in China and abroad. The MaxEnt model was evaluated by using Kappa, TSS, ROC curves, and the AUC values in this study. The results indicated that the model has a particularly good prediction effect on the distribution area of S. guani and has a very high reliability.

FIGURE 6 Importance of environmental variables to Scleroderma guani by Jackknife test.
In this research, the most important environmental variables limiting S. guani distribution, which included bio18, bio11, bio13, and bio3, were screened using the Jackknife test combined with Pearson's correlation coefficient, and the result indicated that precipitation and temperature jointly constrained the current distribution pattern of S. guani. Scleroderma guani has a relatively high tolerance to humidity and can develop normally at a relative humidity of 40%-90% (Yao et al., 1983). The range of bio18 from 302.34 mm to 1784.19 mm was suitable for S. guani occurrence in this research, and the ranges of bio13 and bio19 were also relatively wide, which was consistent with earlier results (Yao et al., 1983). Wang et al. (2004) discovered that the parasitism rate of S. guani is inversely related to temperature. Li et al. (1984) revealed that the reproductive cycle and survival rate of S. guani differed markedly under different temperature and humidity conditions. Temperature and humidity are closely related to the growth and development of S. guani (Zhao, 2019; Yao et al., 1983). The starting temperatures of egg, larva, and pupa are 60.18, 169.71, and 219.00 day degrees, respectively (Yao et al., 1983). The temperature range for indoor artificial rearing of S. guani is 22-28°C, the optimum temperature is 26°C, and the relative humidity is 60%-80%; within this temperature and humidity interval, the inoculation success rate and the number of eggs laid by S. guani are higher (Zhou et al., 2005).
FIGURE 7 Response curves between environmental variables and predicted suitability. Panels a-d show bio18, bio11, bio13, and bio3, respectively.

TABLE 6 Suitable range of environmental variables for potential distribution of Scleroderma guani.

All the above results indicated that temperature and precipitation play a key role in the distribution of S. guani. This study showed that the suitable range of bio11 was −11.5°C to 14.9°C, and that altitudes above 1602.7 m will not be suitable for the distribution of S. guani.
Studies have found that the adults and pupae of S. guani can withstand the low temperature of −24°C and can overwinter in the area at an altitude of 1200-1450 m, but above 1700 m, S. guani cannot survive the winter due to the low temperature (Chen & Cheng, 2000). This is consistent with the results of this research.  (Table 5).
In the future, the suitable range in the Qinghai-Tibet region will decline, while that in the northwest of China will increase. Climate warming is driving the expansion of suitable habitat for S. guani, and it is anticipated that the suitable habitat will shift to higher altitudes and higher latitudes in the future.
As the dominant natural enemy of stem borers, S. guani has been widely used in biological control (Yang et al., 2014; Zhang et al., 2021). Large-scale and low-cost breeding of S. guani has become one of the research hotspots in the field of biological control applications (Liu et al., 2011). The suitable distribution area of S. guani was relatively extensive, and the distribution of its hosts will affect its distribution. The wasp has many kinds of hosts, but Monochamus alternatus and Batocera horsfieldi, which have caused great economic losses, are its main hosts in China (Chen et al., 2008; Zhou et al., 2020). The distribution of hosts was obtained by surveying relevant literature, books, and the GBIF website, and it is shown in Figure 8. The ENMs only describe the basic ecological requirements of species, not the actual ecological requirements. When ENMs are used to predict the potential distribution of species, a variety of biological and non-biological factors affecting the distribution of species are easily overlooked (Xu, Tang, et al., 2021). Up to now, ROC curve analysis has been widely adopted to evaluate discrimination performance when modeling the potential distribution of species, and it reveals the performance of the MaxEnt model (Zhang et al., 2021). The MaxEnt model has general advantages in predicting the potential distribution of species, but still has some limitations (Xu et al., 2019). The response curve only shows the influence of a single environmental factor, ignoring the interaction between variables. It is impractical to consider all environmental factors comprehensively in a particular model analysis, so it may be more efficient to treat the model as a base niche model (Chakraborty et al., 2016). Distribution and modeling results are also influenced by other intrinsic factors (distance and rate of dispersal of species and time of formation) and extrinsic factors (human activities and natural enemies).
The work employed limited occurrence data and considered only the environmental factors associated with temperature and rainfall and did not take into account the influence of biological factors such as host distribution and diffusion, predators, and other environmental factors such as human interference, which may have an impact on the accuracy of the prediction results. Hereafter, biological and non-biological factors such as human activities and host types can be incorporated into the model when studying the suitable area of S. guani in order to improve the accuracy of model predictions.

| CONCLUSIONS
This research applied the MaxEnt model and ArcGIS to estimate the current and future suitable habitat distribution of S. guani in China, based on known distribution information and climate factors. The results revealed that the suitable areas were distributed in low-altitude areas, and the highly suitable areas were mainly concentrated in the coastal area of the Northeast China Plain, the North China Plain, and the Sichuan Basin. The vital environmental variables that impacted the distribution were precipitation of warmest quarter (bio18), mean temperature of coldest quarter (bio11), precipitation of wettest month (bio13), and isothermality (bio3). The distribution range of S. guani in highly suitable areas showed a trend of further expansion. This study provides a reference for expanding current knowledge about the environmental drivers of S. guani distribution, so as to facilitate its use as a biological control agent against stem borers and other pest species. This study used the MaxEnt model to explore the ecological distribution of S. guani, and further research will combine more ENMs and more environmental factors.

ACKNOWLEDGMENTS
The authors thank Xianchun Yan (China West Normal University) for his helpful assistance.

CONFLICT OF INTEREST
None.

DATA AVAILABILITY STATEMENT
The data supporting the results are available in a public repository at: https://doi.org/10.6084/m9.figshare.19344893.v2.